[% setvar title Provide a standard module to simplify the creation of source filters %]
<div id="archive-notice">
    <h3>This file is part of the Perl 6 Archive</h3>
    <p>To see what is currently happening visit <a href="http://www.perl6.org/">http://www.perl6.org/</a></p>
</div>
<div class='pod'>
<a name='TITLE'></a><h1>TITLE</h1>
<p>Provide a standard module to simplify the creation of source filters</p>
<a name='VERSION'></a><h1>VERSION</h1>
<pre>  Maintainer: Damian Conway &lt;<a href='mailto:damian@conway.org'>damian@conway.org</a>&gt;
  Date: 20 Sep 2000
  Last Modified: 29 Sep 2000
  Mailing List: <a href='mailto:perl6-language@perl.org'>perl6-language@perl.org</a>
  Number: 264
  Version: 3
  Status: Frozen
  Frozen since: v2</pre>
<a name='ABSTRACT'></a><h1>ABSTRACT</h1>
<p>This RFC proposes that the interface to Perl's source filtering facilities
be made much easier to use.</p>
<a name='DESCRIPTION'></a><h1>DESCRIPTION</h1>
<p>Source filtering is an immensely powerful feature of recent versions of Perl.
It allows one to extend the language itself (e.g. the Switch module), to
simplify the language (e.g. Language::Pythonesque), or to completely recast the
language (e.g. Lingua::Romana::Perligata). Effectively, it allows one to use
the full power of Perl as its own, recursively applied, macro language.</p>
<p>The Filter::Util::Call module (by Paul Marquess) provides a usable Perl
interface to source filtering, but it is not nearly as simple as it could be.</p>
<p>To use the module it is necessary to do the following:</p>
<ul>
<li><a name='1.'></a>1.</li>
<p>Download, build, and install the Filter::Util::Call module.</p>
<li><a name='2.'></a>2.</li>
<p>Set up a module that does a <code>use Filter::Util::Call</code>.</p>
<li><a name='3.'></a>3.</li>
<p>Within that module, create an <code>import</code> subroutine.</p>
<li><a name='4.'></a>4.</li>
<p>Within the <code>import</code> subroutine do a call to <code>filter_add</code>, passing
it either a subroutine reference.</p>
<li><a name='5.'></a>5.</li>
<p>Within the subroutine reference, call <code>filter_read</code> or <code>filter_read_exact</code>
to &quot;prime&quot; $_ with source code data from the source file that will
<code>use</code> your module. Check the status value returned to see if any
source code was actually read in.</p>
<li><a name='6.'></a>6.</li>
<p>Process the contents of $_ to change the source code in the desired manner.</p>
<li><a name='7.'></a>7.</li>
<p>Return the status value.</p>
<li><a name='8.'></a>8.</li>
<p>If the act of unimporting your module (via a <code>no</code>) should cause source
code filtering to cease, create an <code>unimport</code> subroutine, and have it call
<code>filter_del</code>. Make sure that the call to <code>filter_read</code> or
<code>filter_read_exact</code> in step 5 will not accidentally read past the
<code>no</code>. Effectively this limits source code filters to line-by-line
operation, unless the <code>import</code> subroutine does some fancy
pre-pre-parsing of the source code it's filtering.</p>
</ul>
<p>This last requirement is often the stumbling block. Line-by-line
source filters are not difficult to set up using Filter::Util::Call,
but line-by-line filtering is the exception, rather than the norm.
Since a newline is just whitespace throughout much of a Perl program,
most useful source filters have to make allowance for components that
may span two or more newlines. And that complicates the filtering code
enormously.</p>
<p>For example, here is a minimal source code filter in a module named
BANG.pm. It simply converts every occurrence of the sequence <code>BANG\s+BANG</code>
(which may include newlines) to the sequence <code>die 'BANG' if $BANG</code> in
any piece of code following a <code>use BANG;</code> statement (until the next
<code>no BANG;</code> statement, if any):</p>
<pre>        package BANG;
 
        use Filter::Util::Call ;

        sub import {
            filter_add( sub {
                my $caller = caller;
                my ($status, $no_seen, $data);
                while ($status = filter_read()) {
                        if (/^\s*no\s+$caller\s*;\s*$/) {
                                $no_seen=1;
                                last;
                        }
                        $data .= $_;
                        $_ = &quot;&quot;;
                }
                $_ = $data;
                s/BANG\s+BANG/die 'BANG' if \$BANG/g
                        unless $status &lt; 0;
                $_ .= &quot;no $class;\n&quot; if $no_seen;
                return 1;
            })
        }

        sub unimport {
            filter_del();
        }

        1 ;</pre>
<p>Given this level of complexity, it's perhaps not surprising that source
code filtering is not commonly used.</p>
<p>This RFC proposes that a new standard module -- Filter::Simple -- be
provided, to vastly simplify the task of source code filtering,
at least in common cases.</p>
<a name='The Filter::Simple module'></a><h2>The Filter::Simple module</h2>
<p>Instead of the above process, it is proposed that the Filter::Simple module
would simplify the creation of source code filters to the following
steps:</p>
<ul>
<li><a name='1.'></a>1.</li>
<p>Set up a module that does a <code>use Filter::Simple sub { ... }</code>.</p>
<li><a name='2.'></a>2.</li>
<p>Within the anonymous subroutine passed to <code>use Filter</code>, process the
contents of $_ to change the source code in the desired manner.</p>
</ul>
<p>In other words, the previous example, would become:</p>
<pre>        package BANG;
 
        use Filter::Simple sub {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        };

        1 ;</pre>
<a name='Module semantics'></a><h2>Module semantics</h2>
<p>This drastic simplication is achieved by having the standard
Filter::Simple module export into the package that <code>use</code>s it (e.g.
package &quot;BANG&quot; in the above example) two automagically constructed
subroutines -- <code>import</code> and <code>unimport</code> -- which take care of all the
nasty details.</p>
<p>In addition, the generated <code>import</code> subroutine passes its own argument
list to the filtering subroutine, so the BANG.pm filter could easily
be made parametric:</p>
<pre>        package BANG;

        use Filter::Simple sub {
            my ($die_msg, $var_name) = @_;
            s/BANG\s+BANG/die '$die_msg' if \${$var_name}/g;
        };

        # and in some user code:

        use BANG &quot;BOOM&quot;, &quot;BAM;  # &quot;BANG BANG&quot; becomes: die 'BOOM' if $BAM</pre>
<p>The specified filtering subroutine is called every time a <code>use BANG</code>
is encountered, and passed all the source code following that call,
up to either a <code>no BANG;</code> call or the end of the source file (whichever
occurs first).</p>
<a name='MIGRATION ISSUES'></a><h1>MIGRATION ISSUES</h1>
<p>None.</p>
<a name='IMPLEMENTATION'></a><h1>IMPLEMENTATION</h1>
<p>A prototype implementation is available from:</p>
<pre>        <a href='http://www.csse.monash.edu.au/~damian/CPAN/Filter-Simple.tar.gz' target='_blank'>www.csse.monash.edu.au</a></pre>
<p>and should soon appear on the CPAN.</p>
<p>This prototype requires the Filter::Util::Call module, but it is
proposed that a standard Filter::Simple module would be self-sufficient.</p>
<a name='NOTE'></a><h1>NOTE</h1>
<p>It is certainly not suggested that Filter::Simple should <i>replace</i>
Filter::Util::Call. That module provides much more flexible control
over source code filtering, and will still be needed in many cases.
It is merely proposed that simplified code filtering covering
the common cases ought to be incorporated in the core.</p>
<a name='REFERENCES'></a><h1>REFERENCES</h1>
<p><a href='http://search.cpan.org/search?dist=Filter' target='_blank'>search.cpan.org</a></p>
</div>
